Cross-language phoneme mapping for phonetic search keyword spotting using multiple source languages

Authors

  • Ella Tetariy
  • Yossi Bar-Yosef
  • Michal Gishri
  • Ruthi Alon-Lavi
  • Vered Aharonson
  • Irit Opher
  • Ami Moyal
Abstract

Performing Phonetic Search Keyword Spotting (PS KWS) in new languages when language resources are scarce is an interesting and challenging task. In a previous paper we reported a methodology that enabled PS KWS under these conditions utilizing cross-language phoneme mappings from another sufficiently resourced and well-trained source language. We performed phoneme recognition in the new target language with the acoustic model of the source language. The keyword search was performed over a phoneme lattice of the target language phonemes following a mapping from one language to the other. In the present work we extend this method and its capabilities by mapping two source language phoneme sets into one target language set and performing a combined lattice search. Testing the technique on English and Arabic as source languages yielded a 50% Detection Rate (DR) and a False Alarm Rate (FAR measured in number of false alarms per hour per keyword) of 2 when Spanish was the target language, a DR of 36% and FAR of 4 when Dari was the target language and a DR of 35% and FAR of 6 with Farsi as the target language. These results indicate that combining two source languages is better than using a single language since the acoustic space is better represented. Searching in a combined lattice while employing adequate phoneme transformations significantly improves performance. Such a system can be used as an initial version of a PS KWS system in a new language when sufficient language resources are not available.
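The idea of mapping two source-language phoneme sets into one target set and searching a combined lattice can be illustrated with a small sketch. It is a simplification under stated assumptions, not the authors' implementation: the lattice is reduced to one set of competing phonemes per time slot, the two recognizers are assumed to be time-aligned, and the mapping tables `EN_TO_ES` and `AR_TO_ES`, the toy lattices, and all function names are hypothetical. The last two functions follow the DR and FAR definitions given in the abstract.

```python
from typing import Dict, List, Set

# Hypothetical source-to-target phoneme mappings (illustrative, not from the paper).
EN_TO_ES: Dict[str, str] = {"k": "k", "ae": "a", "s": "s", "ah": "a", "z": "s"}
AR_TO_ES: Dict[str, str] = {"k": "k", "a": "a", "s": "s", "q": "k"}

# Simplified lattice: one set of competing phoneme hypotheses per time slot.
Lattice = List[Set[str]]


def map_lattice(lattice: Lattice, mapping: Dict[str, str]) -> Lattice:
    """Replace each source phoneme by its target phoneme; drop unmapped ones."""
    return [{mapping[p] for p in slot if p in mapping} for slot in lattice]


def combine(a: Lattice, b: Lattice) -> Lattice:
    """Union the hypotheses slot by slot (assumes time-aligned lattices)."""
    if len(a) != len(b):
        raise ValueError("lattices must have the same number of slots")
    return [sa | sb for sa, sb in zip(a, b)]


def search(lattice: Lattice, keyword: List[str]) -> List[int]:
    """Return start slots where the keyword phonemes occur in consecutive slots."""
    n, m = len(lattice), len(keyword)
    return [
        start
        for start in range(n - m + 1)
        if all(keyword[i] in lattice[start + i] for i in range(m))
    ]


def detection_rate(hits: int, occurrences: int) -> float:
    """Fraction of true keyword occurrences that were detected."""
    return hits / occurrences


def false_alarm_rate(false_alarms: int, hours: float, keywords: int) -> float:
    """False alarms per hour per keyword."""
    return false_alarms / (hours * keywords)


# Toy example: the Spanish keyword "casa" as target phonemes.
keyword = ["k", "a", "s", "a"]
english = map_lattice([{"k"}, {"ae"}, {"th"}, {"ah"}], EN_TO_ES)  # misses "s"
arabic = map_lattice([{"q"}, {"i"}, {"s"}, {"a"}], AR_TO_ES)      # misses first "a"

print(search(english, keyword))                   # []
print(search(arabic, keyword))                    # []
print(search(combine(english, arabic), keyword))  # [0]
```

In this toy case neither source language alone covers every phoneme of the keyword, but the union of their mapped hypotheses does, which mirrors the abstract's argument that two source languages represent the target acoustic space better than one. A real lattice would also carry scores and timing, which this sketch omits.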


Similar resources

Subword and phonetic search for detecting out-of-vocabulary keywords

We compare several approaches, separately and together, for spotting of out-of-vocabulary (OOV) keywords, in terms of their ATWV scores. We considered three types of recognition units (whole words, syllables, and subwords of different lengths) and two basic search strategies (whole-unit, fuzzy phonetic search). In all cases, the search was performed by collapsing the recognition lattice into a ...


Phoneme-based Statistical Transliteration of Foreign Names for OOV Problem

Given a source language term, machine transliteration is to automatically generate the phonetic equivalents in a target language. It is useful in many cross language applications. Recently, there are increasing concerns about automatic transliteration, especially with languages with significant distinctions in their phonetic representations, e.g. English and Chinese. Despite many cross-language...


Cross Lingual Modelling Experiments for Indonesian

The extension of Large Vocabulary Continuous Speech Recognition (LVCSR) to resource poor languages such as Indonesian is hindered by the lack of transcribed acoustic data and appropriate pronunciation lexicons. Research has generally been directed toward establishing robust cross-lingual acoustic models, with the assumption that phonetic lexicons are readily available. This is not the case for ...


Spoken cross-language access to image collection via captions

This paper presents a framework of using Chinese speech to access images via English captions. The formulation and the structure mapping rules of Chinese and English named entities are extracted from an NICT foreign location name corpus. For a named location, name part and keyword part are usually transliterated and translated, respectively. Keyword spotting identifies the keyword from speech q...


Training Acoustic Models with Speech Data from Different Languages

We present a technique to train acoustic models for a target language using speech data from distinct source languages. In this approach, no native training data from the target language is required. The acoustic model candidates for each target-language phoneme are automatically selected from a group of existing source languages by means of a combined phonetic-phonological (CPP) metric, develope...



Journal title:
  • Artif. Intell. Research

Volume 5  Issue 

Pages  -

Publication date 2016